Kirsten Weand 


We are just waiting another moment here for someone to join, and then we will get started. So
please allow one more minute, and we will kick-start the meeting. Thanks again for joining us
here today.


All right.
Good morning, everybody. Thank you for joining us here today. 


It's a pleasure to welcome you all to learn more about the role that AI can play in advancing
global health, and what challenges and risks we need to be aware of and address to be able to
seize those opportunities.


My name is Kirsten Weand, and I'm co-hosting today's session on behalf of the U.S. Department of
State's Bureau for Global Health Security and Diplomacy, in partnership with my colleagues at the
Gates Foundation.


Today's session will begin with an overview of the basics of AI, including the different types of
AI technologies, to lay the foundation for a more in-depth session. That will be followed by three
presentations illustrating how AI is being used in the global health context, with an ample amount
of time at the end for Q&A. We hope that this listening session will start to spur discussions
about AI and global health.


To begin the meeting, though, I want to provide a few remarks about what the State Department,
and in particular the Bureau for Global Health Security and Diplomacy, is doing about AI,
and how we're looking at this issue.


Given the broad reach of AI and its potential impacts on our global work, GHSD and the
Gates Foundation are co-hosting a series of educational sessions on AI and its relevance to global
health. GHSD is actively assessing the AI landscape and considering how we can guide and
empower our workforce and programs to interact with the technology.


Today's session is the second in a series of listening sessions around AI and global health and is
meant as an introduction for global health professionals.


This session is intended to start our collective learning and dialogue on the opportunities and
challenges to using AI in a global health context.


The State Department has led efforts centered on international approaches and engagement to
promote the safe, secure, and trustworthy use of AI and to set the norms for international
governance of AI development and use.


We will continue to follow the Department's lead while also exploring GHSD's opportunities and
challenges in advancing productivity with AI. To do so, we must first mitigate risks of misuse and
harm.


AI technologies are powerful tools that can advance a wide range of global health and health
security efforts. To harness the benefits of AI most effectively, GHSD and the Gates Foundation
have held significant conversations regarding areas where AI can have a meaningful impact on
programs with little to no risk to PEPFAR beneficiaries and populations, including key and priority
populations.


We hope that AI can have a transformative impact on the arc of the epidemic.


PEPFAR has consistently led in the adoption of new tools, and AI presents a great opportunity to
be at the frontier of this exciting technology.


Thank you for joining us today for this important, game-changing topic. We hope you enjoy and
learn from today's session and that you will continue to engage with us in future sessions. We
welcome your feedback on the series as we continue to engage on the topic and plan for future
sessions.


So with that, today we are gonna first turn it over to Guillaume, who will introduce our colleagues
at Audere for an introduction to AI.


Guillaume Chabot-Couture 00:05:23 


Thank you so much, Kirsten, for these opening words. As she said, my name is Guillaume
Chabot-Couture. I'm the deputy director at the Institute for Disease Modeling at the Gates
Foundation. I lead an AI research team as well as a team focused on modeling and analyzing
disease data for polio and malaria, as well as other vaccine-preventable diseases, and I've been
working in global health for 14 years now. It's my distinct pleasure to introduce Audere.


Audere is a digital health nonprofit co-headquartered in South Africa and the U.S., dedicated to
advancing health equity in resource-limited communities through its HealthPulse AI toolkit. By
pioneering collaborative, AI-powered digital solutions, Audere ensures their quality, safety, and
performance over time while supporting AI capacity building for local innovators and decision
makers. Our two speakers today will be Dino and Sarah.


Dino is the CEO of Audere. He's a public health and emergency medicine physician and a
social entrepreneur with nearly 20 years of experience improving health outcomes and access across
the developing world. He's South African in origin. He's a nonprofit leader and an implementation
scientist. He's led Audere for the past 3 years, and under his leadership Audere's HealthPulse AI
has evolved into a vital, interactive toolkit utilized in 11 countries. Previously, Dino served as
CEO of the Aurum Institute and as CEO of the Centre for HIV-AIDS Prevention Studies (CHAPS),
which is a Gates Foundation-funded organization.


Sarah is the chief product officer at Audere. She has led the product, design, and research
functions for nearly 4 years. She has spearheaded product strategy and platform investments in
Audere's HealthPulse AI, an interactive toolkit that has AI-verified over 550,000 rapid tests for
malaria, HIV, and COVID in the field. The toolkit delivers value in various use cases, including
disease surveillance, supportive supervision of community health workers at scale, and automating
pharmacy incentive program audits for HIV testing programs in Kenya and South Africa.


She has nearly 20 years of experience at organizations like Microsoft, Meta, and Expedia. Sarah
has a proven track record, and she's led very big projects. So it's my distinct pleasure to
introduce them and to hand it over to you both for this introduction to AI. Thank you so much.


Sarah Morris | Audere 00:08:23 


Thank you so much, Guillaume, appreciate it. We just want to start by saying thank you to the whole
Global Health Security and Diplomacy team and to the Gates Foundation. We're really excited to be
here and spend 25 minutes demystifying AI. It is a bold challenge, but we are up for it. So let's
jump in.


Alright. So I'm Sarah, as Guillaume said, and I lead our product team at Audere. Today I could not
be more proud to be here talking to this group of trailblazers who are thinking about applications
for AI to extend health equity.


Dino, are you able to come off mute?


Dino Rech 00:09:03 
Thanks, Sarah, and I'm Dino Rech, the CEO at Audere.


I bring my expertise as a South African public health physician, nonprofit leader, and
implementation scientist to our work.


Three years ago I knew very little about AI.


I know that one of the biggest structural barriers to realizing the benefits AI can bring to
healthcare is creating a shared base of understanding. I'm proud that we've got a bit of time to
start that journey together today.


Sarah Morris | Audere 00:09:30 
Alright, so I would like to start with an exercise.
What is this?
Please enter your very first reaction into the chat, and I promise this is not a trick question.
Let's give people just a moment to do just that.
Alright.


So let me just say this is not a trick question. Some of you may know this is an HIV self-test. So
let's check in with AI and see what AI thinks.

So I asked a large language model, or an LLM, what it thinks this is. And the first thing that we
got back is a pregnancy test, the original scaled self-test. But let's try again. We know that's
not the answer.


So if we click again, all right, we're getting a little bit warmer. Let's try a couple more LLMs.


All right, very interesting indeed. So let's shift and try a different type of AI: computer
vision, which here is a type of AI that was trained to read rapid tests for a use case like disease
surveillance. So let's see.


Alright. Test type, we get the brand, we get a result. That's pretty cool. Well, let's try a
couple more times, just like we did with LLMs.


OK.
We get the correct answer every time. 


So, Dino.


Dino Rech 00:11:14 
Sure.
This activity isn't meant to point out AI errors, which happen in all types of AI.
Instead, let's talk about the different types of AI, like language models and computer vision,
and the importance of leveraging AI designed for your use case in order to realize value and to
build trust in AI.


Today we will cover prominent types of AI, how AI learns, and the value-added roles it can play,
and explore the pillars of trusted AI, such as reproducibility, which we just saw demonstrated.


We'll also look at how to build increased capacity for AI.
Note that all these examples are raw AI outputs, not end-user products.
After the intro, you'll hear from organizations developing or leveraging AI tools in their products
across a few different types of AI.
Sarah Morris | Audere 00:12:07 


Alright. So let's start with our first caution. We said we were cautious optimists, but we'll start
with the caution first. So I just want to say that it's tempting to think that AI can solve
everything that we want to do, but not every problem is a nail that needs an AI hammer. Today is
meant to be a baseline for what AI is and for important considerations if you choose to leverage AI.


So, as we all know, AI is not a silver bullet, and there are always trade-offs between costs, such
as financial costs, access, and privacy, and value, both to the health system and to the end users
of products that are powered by AI.


Now these trade-offs are not static. Early in the life of a new technology, such as LLMs,
they're always steeper, and over time the trade-offs may become less steep and the value add
more clear as the technology evolves.


Alright. So with our caution out of the way, let's talk about trust.


Dino Rech 00:13:10 
So trust is built over time.


It leads to adoption and further innovation. So let's collect the trust baseline for AI. Please get
out your phones, or if you're on a PC, open a new browser tab.


Let's think about what activities we would only trust humans with versus activities we would only
trust AI with.


Scan the QR code or go to the link shown, and enter as many answers as you can.


Sarah Morris | Audere 00:13:41 


Alright. So we've got a first one here to get you started. Hopefully folks can scan this QR code,
and I just wanna say there are no wrong answers here. As you enter these, we don't know your name.
The folks at Gates and the global health team will not see any names tied to these.


Alright. So, a human. Dino, as a physician, surgery must be a really good one to see there.


Dino Rech 00:14:11 


That one comes up a lot and makes a lot of sense.


Sarah Morris | Audere 00:14:14 
Alright. I like it.


A human for praying, cutting hair. That's a really good one. 


Dino Rech 00:14:22 
No one trusts AI with driving a car just yet, it seems.


Sarah Morris | Audere 00:14:26 
Love. Lots of people entering love.
These are good.
Alright! A lot of things with empathy, a lot of things having to do with life and death.
Alright.
So let's shift to AI
and think about all of the types of AI that you know about.


Maybe some of the examples we just provided a moment ago. What would you only trust AI with
and not a human?


I'm thinking that "pray" might have been left over from humans. But otherwise we've got a whole new
use case for AI here that I have not thought about before. Dino, what do you think?


Dino Rech 00:15:11 


That is an interesting one for chat. 


I think a lot of the data-driven ones make a lot of sense, and we see those often.


Sarah Morris | Audere 00:15:20 


Yeah, absolutely. Lots of large scale analysis, computation, programming. Nothing is getting some 
good air time here. 


Dino Rech 00:15:29 


You shouldn't. 


Sarah Morris | Audere 00:15:32 


Alright. We want everyone to think about the answers to these questions as we go through the
session today and you see examples of how AI is impacting healthcare delivery. And at the end we
want you all to think about what may have shifted for you, if anything.


Alright. So we're gonna do a very quick history lesson. What is artificial intelligence, anyway? And
where did it come from?


Well, in the beginning, of course, there was Ada Lovelace, the first computer programmer, who
paved the way for future generations and all of the women in computing, like myself.


So fast forward to 1950, when Alan Turing introduced the Turing test to measure computer
intelligence. In 1956 McCarthy coined the term artificial intelligence, but notably during this time
it cost 200,000 U.S. dollars a month to lease a computer. This definitely restricted access to only
the most prestigious universities and big tech companies, and that limited the diversity of thought
in the room and in the field.


Now, a little bit later, Elaine Rich published one of the first AI textbooks, and in 1985 a guy you
might all know, Bill Gates, aimed to have a personal computer in every home. Now this reduced
some of the cost barriers, but it wasn't until 2010 that sub-$100 pocket-size computers, or
smartphones, brought mobile-first Internet access to low- and middle-income countries, giving us
our very first important step towards equity.


Dino Rech 00:17:11 
So what are the various types of AI?


That abbreviated timeline that Sarah took us through stopped in 2010. We obviously missed a lot,
like, for example, language models, which are exciting, with so much potential.


But don't worry. We'll get there.

But first, let's briefly cover the various types of AI.

Computer vision really represents AI that sees and recognizes visual patterns.

Automated speech recognition: AI that hears and recognizes speech or acoustic signatures.


Language models and natural language processing: AI that understands language patterns and
attempts to converse like a human.


And then big data analytics: AI that synthesizes vast amounts of data into information and insights
or predictions.


Using multiple AI tools together is called multimodal AI, which produces richer, human-like outputs
and is moving towards this notion of artificial general intelligence, which is AI that learns and
applies knowledge just like a human.


Sarah Morris | Audere 00:18:18 
So let's categorize the roles that AI can play, especially in healthcare.
In each we see that AI is additive, not replacing humans, but instead partnering with them.


AI can collaboratively upskill humans to support decision making, such as asking someone to take
another look if they miss a faint positive test line.


AI can provide supportive supervision, like program monitoring at scale to identify who might need
additional training.


AI can prioritize, concentrating limited resources and effort in the right places. For example, in
one program in South Africa, attitudes and behaviors shared with an AI companion will aid in the
prediction of client vulnerability to HIV. That prediction will help us focus precious clinician
time and outreach on the most vulnerable clients.


And finally, AI can help automate tasks to reduce human effort and cost. Now we at Audere
work with a pharmacy incentive program in East Africa that previously had humans looking at
every single rapid test image for quality control. AI now automates much of that process, enabling
them to cost-effectively scale up their services.


So let's return to our notion of trust. How can we build trusted AI to reliably support humans,
especially in healthcare situations? Well, I'll touch on the most important pillars briefly before
we look at some examples.


So first is explainability, or transparency in how the model works.
Next is reproducibility, or the ability to achieve the same result reliably, just as we saw earlier.
Observability, or the ability to measure correctness by analyzing states with inputs and outputs.


Privacy and security, one of my favorites, for health systems, workers, and patients, and
importantly preserving these especially for users from vulnerable populations.


Bias mitigation, or ensuring that training and validation data is representative of the population
and is the right data for the problem that you're solving.


And finally, empathy, or centering users in the design process to ensure that the entire product,
including AI, leads to the right outcomes that you're looking for.


Oh, Dino, we can't hear you. You're muted.


Dino Rech 00:20:47 
Sorry about that. So with these key pillars in mind, let's talk about how AI learns.
Imagine a small baby lion, naive and eager. It only knows what it's seen so far.


But we can train it. We can show it data and images of what is correct and what's incorrect. And
after enough examples, it becomes mature, with the right context to make decisions on its own,
recognize patterns, and distinguish differences.


But if we feed it too much in one direction, it may become biased or recognize patterns we didn't
intend for it to pick up.


Sarah Morris | Audere 00:21:25 


So let's make this real and bring together the pillars and how AI learns. So remember the
computer vision example we mentioned earlier for reading rapid diagnostic test results.


So these models are trained on images of tests that are activated and captured by local experts in
South Africa and on real images from the field. Synthetic and augmented data augment our models
to mitigate bias in the data.


It's important to ensure that no personal information is included. Models are static and do not
automatically adjust in real time.


When the model reads an image, it returns the lines identified and flags image quality issues for
explainability.


Results are a hundred percent reproducible, and accuracy is verified pre-launch and monitored
over time to prevent drift.


Empathy is infused into the user experience. AI tools are integrated into products where they can
be leveraged to guide workflows, nudges, or trigger other events.


Dino Rech 00:22:29 
Let's compare what we just saw with computer vision to how LLMs learn.


They learn from all the available data on the Internet. This means social media, Wikipedia,
World Health Organization memos, local health guidance.


They're trained on all the good days humans have had. But they're also trained on all the bad days
humans have ever had, including misinformation, hate speech, and more.


The output is AI that mimics humans, who are empathetic, responsive, sometimes explainable. But
on the flip side, we also change our minds often and don't repeat everything the same way. We are
fallible, confused, and at times incredibly biased, whether we are aware of it or not.


Sarah Morris | Audere 00:23:15 


True, and with LLMs, please keep in mind this is a nascent space. There's a ton of work underway
to extend the use of LLMs. So this example here is from one of our programs in South Africa, in
partnership with a research firm called in Leila.


And we use a variety of large language models to guide empathetic, stigma-free HIV prevention
conversations. Now, in the LLM-generated summary of a conversation that a client had with an AI
companion, you can see that the model is flagging that it made a mistake.


Now, this is a really useful capability to harness in real time for monitoring, to identify
deficits in data or potentially harmful situations which might need immediate attention.


Dino Rech 00:24:02 
So we told the language model we were in South Africa.
Why didn't it know which self-tests were available?


Well, back to our baby lion.


What would happen if the baby lion only saw examples of vegetarian lions, lots and lots of
examples?


While it is accurate that lions can eat fruits and vegetables, it's unclear for how long a
vegetarian lion would survive in the wild if it was no longer an apex predator.


The key takeaway is that if your training data is biased, such as having an abundance of data from
countries with broad access to the Internet prior to 2010, and you intend to use your model in
countries that only meaningfully gained access after 2010, the skew in their contribution to the
available data on the Internet may lead to confusing and many times inaccurate results.


Sarah Morris | Audere 00:24:52 


All right. So that kind of confuses me, because I see news stories every day about LLMs acing med
school entrance exams and outperforming humans. Well, let's pull back just a little bit and talk
about two axes from our evaluation framework: accuracy and viability. So first, the what: the
accuracy of the information returned, or the facts.


Now with computer vision, it's relatively straightforward to measure. These are qualitative,
visually read tests.


We can have a panel of trained humans provide ground truth and compare the AI output to
determine how accurate it is. The what is clear.


Now with LLMs we can ask a question like, what is PEPFAR? We could evaluate similarly to
our computer vision example, and we know that this is the correct answer.


But we really need to consider the use case and the end user who will see that result. We call this
viability, or how the AI outputs are delivered.


Now, once we understand who's using the product and their personal stake in the outcome, we
aim to make that information accessible.


Here, with the delivery of an HIV self-test result, we deeply consider the gravity of the
information being shared, and empathy is crucial for users to trust that test result.


Now, LLMs take this to a whole new level.


For our PEPFAR example, the answer on the left is appropriate for an audience like this, but we can
tailor the response to maximize comprehension and build trust by mimicking tone, style, and language
for various audiences. And hopefully, every time you ask that question you get the same factual
what but variation in the how, just like you would get from real humans.


Dino Rech 00:26:48 
But prompt engineering alone isn't sufficient.


In the research project in South Africa we mentioned earlier, a teen girl asked if she could get
HIV from sharing soap with her sister, who is living with HIV.


Six months ago an LLM replied with a long, confusing answer, adding doubt and fear for the teen
girl.


Over just a few short months, and likely using the data we were feeding it during the project,
the language model evolved to present this new, concise answer:


No, you cannot get HIV from a bar of soap.
Accurate.


However, through prompt engineering, we can ask it to mimic the tone and style of an adolescent
girl.


Again the response was factually accurate, but not completely viable, calling HIV at one point
scary, which could exacerbate stigma.


Sarah Morris | Audere 00:27:40 


So in a longer version of this presentation we cover retrieval-augmented generation and
fine-tuning for language models to help improve your chances of receiving accurate, viable answers.
But we've spent a ton of time so far on language models. So let's talk about the active learning
cycle for all types of AI, which you'll hear more about in a bit.


Once we have a robust evaluation framework for our AI, and it meets our established benchmarks,
and it's now being used in the field, how do we make sure that it stays accurate and viable?


Now we may identify a deficit, or a case where the AI isn't performing well. We may need to train
our model with new data, but it shouldn't forget all of the previous knowledge that it had. Now,
this is kind of like clinicians learning new standards of care.


So we need to augment our validation data set to cover the new data.


Automating this process allows for continuous sampling of AI outputs at scale for human review,
similar to regression testing in software development.


Now, assessing accuracy and viability each requires specific skills.


For computer vision labeling, human annotators are trained to interpret test results accurately,
especially in adversarial conditions that are regularly seen in the field.


For language models, we engage local communities for viability assessment and clinicians for
accuracy.


Designing systems with future scale and automation for cost-effectiveness in mind is really
important.


Nothing about us without us.


And this is a phrase born out of disability rights. It's a slogan meant to communicate the idea that 
no policy should be decided by any representative without full and direct participation of the 
individuals affected by that policy. 


Now this has been adopted across many sectors, and it's become quite a mantra for human-centered
design. This applies directly to local contexts, lived experience, and designing with
communities rather than for them.


Trevor Noah recently spoke at a nonprofit Leaders Conference we attended, and he said that some 
of the most innovative ideas come from places where innovation is a necessity, not a luxury. 


Absolutely. The communities we serve should be in the room, but it's really hard to figure out how 
to do that. 


But given our limited time here today, I'm going to give you all a little bit of homework. 


There are two toolkits that we really like, and they have guides on effective co-creation, design,
bias reduction strategies, and more. And these are just a couple to get you started.


Dino Rech 00:30:26 


Okay. So we've covered a lot of ground in a very short period of time. We started by saying we're
cautious optimists. So, heeding the caution, if we want to turn the optimism into reality, we've
got work to do.


If you were at the World Health Assembly recently, or really any conference in the last year, you
likely heard or saw someone talking about AI structural barriers. There are many, too many to cover
today.


But let's think about some of them that individuals or smaller organizations can actually impact
and plan for in their work. First, training and skills, which include everything from Ministry of
Health understanding of AI to health worker and end-user education on data use and rights.


Second, infrastructure needs, including mobile devices for access to AI-powered tools, computers for
system monitoring, and cloud service availability and data hosting restrictions in various
countries.


Third, planning for power and Internet access is essential. These are essential areas which are
often overlooked until deployment planning. And finally, system use and maintenance are crucial:
balancing the cost of paid services against local hosting solutions, and planning monitoring and
maintenance.


Let's bring this back to semi-reality before we wrap up. Let's walk through a quick potential
patient journey, bringing together the many forms of AI we discussed today.


Imagine a community health worker named Kanya, who has a low-end smartphone in her pocket
when visiting patients in her community. She could begin a session by scanning a fingerprint to
identify the patient.


This helps Kanya access the previous visit records and any new conversation the patient may
have had with her on-device AI health companion.


With this information Kanya begins a new session where the history and data are recorded
and populated using voice recognition through her phone: speech and acoustic AI.


Kanya learns the patient has had a high-risk sexual encounter and is concerned about HIV and
other STIs.


She's able to offer the patient both testing options: in person, or a multiplex HIV/STI self-test
kit to run at home. The patient elects to test in the privacy of her home and is provided with a
link to a self-test companion, powered by a language model and computer vision, to assist her.


She successfully tests at home and captures her results with her phone's camera. Luckily both the
HIV and STI test results are found to be negative by the AI computer vision.


Kanya's patient is linked to a remote clinician via telehealth, who has access to the AI-supported
images and results as well as relevant information the language model gathered.


He's able to talk to the patient, counsel her, and prescribe PEP or PrEP.


All the relevant data, with the patient's consent, is connected to a Ministry of Health information
system where big data models run to analyze trends and patterns. The MOH picks up a spike in
testing and HIV-positive results from patients in that geography and is able to action a targeted
education and testing campaign.


Sarah Morris | Audere 00:33:24 


That was a great walkthrough and a vision that I think we all want to see come to life. So, we made
it. Before we wrap up and hear from others in this space to learn more about different products
leveraging AI and their use cases, hopefully we've got the chat open. Go ahead and drop in the
chat how you're feeling at the end of the session today. Are you cautious? Optimistic? Something
else entirely? And hopefully any questions you have we'll be able to get to during this session.


Just wanna say thank you. And Dino, final words over to you.


Dino Rech 00:34:00 


Thanks very much, and back to the team. We hope that was informative.


Sarah Morris | Audere 00:34:05 
Thank you. 
Kirsten Weand 00:34:06 


Great. Thank you so much, Dino and Sarah. That was a great presentation. It's really informative
and helpful in understanding the different types of AI technologies, the benefits and the challenges
of using those systems, and how to ensure the safety, security, and trustworthiness which is
particularly important in the context of global health. So thank you for that foundational
information.


I'd like to take just a brief moment to welcome Ambassador Dr. John Nkengasong to the meeting.
He's able to join us today. Ambassador Nkengasong currently serves as U.S. Global AIDS
Coordinator and the Senior Bureau Official for the Bureau for Global Health Security and Diplomacy.


Ambassador, it's a great honor to have you here today. I'd like to turn it over to you for a few
remarks before we dive into the panel discussion. Over to you.


Ambassador John Nkengasong 00:34:50 
No, thank you so much. What a great presentation! I think at times challenges work out for one's
own good. I think I was supposed to give opening remarks and maybe disappear, but because of
another commitment with the leadership at the State Department, I came in later and was able to
stay and enjoy this presentation.


So let me just keep my comments short. I think we are really looking forward to a strong
partnership between PEPFAR and the Gates Foundation to advance AI in global health, and very
specifically in the struggle against HIV. You did touch on some of those areas, including rapid
testing and also PrEP, uptake of PrEP.


I think that's all great. In your presentation you touched on some of the things that I wanted
to highlight, which is, as we go down this route, to look at policy issues to create that community
trust and community awareness which we require, so that we start thinking very early about policy
engagements.


Secondly, looking at what we do with training and skills, mobilizing people and making sure that
the community understands what this is all about. That will also lead to the empathy and trust that
you indicated.


Translating some of the powers of AI, and making sure that we also understand that this is
not a magic bullet to solve all our problems. But it probably provides a unique opportunity to
take us to the next level of impact.


And lastly, structural barriers. I say that with excitement, and I am personally excited with
this, and I can't wait to see how much we can already pilot some of these concepts in fighting
HIV/AIDS in a country like South Africa or any other high-burden country. But to do that you need
champions. We need to begin to develop those champions on the continent of Africa and
elsewhere; that collaboration would take us far.


There are some very specific conveners that we should gravitate towards on the continent. A group
like AfroChampions, based in Ghana, is very good at convening. I think rallying around them and
leveraging their convening power and ability to reach out to the policy level, the political
level, and the community level is what we should be doing like yesterday, or should be doing like
today, to prepare ourselves for the future.


Looking at groups that have the penetrating power,
like the AfroChampions group in Ghana and in other places, is exactly the kind of networks that
we should be building. The key word here is building networks, networks that would drive this
process forward, and speedily. This is nothing that we have to wait for a trickle-down access. This is
very powerful, what we are presenting here. And I truly look forward to working with you to drive
this forward to create impact in the fight against HIV/AIDS. So thank you so much for inviting me to
be part of this dialogue.


Kirsten Weand 00:38:06 


Great. Thank you so much, Ambassador, and couldn't agree more on those sentiments. And I think,
talking about what some of these opportunities are, we're gonna be having a panel session next,
diving into a couple of different use cases where AI and machine learning are being used to
address some of those challenges that we see in global health, particularly in combating HIV/AIDS
and strengthening our responses to that.


So first up, I'm gonna introduce the panelists here for today, and then we'll go through
one by one and hear from them before we turn to a Q&A session at the end. Just a note for
everybody during these sessions: please enter your questions into the Q&A so we can address
those either written or verbally when we come to the Q&A
session. So please put those in the Q&A section, and you can find the tab here at the bottom of
your screen.


So first up we'll have Jonathan Friedman. Jonathan is the Director of Data Science at Palladium
and oversees applications of machine learning and artificial intelligence to improve HIV care and
treatment and the development of generative AI models, including an HIV informational chatbot.


He's going to speak with us about Palladium's approach to using machine learning to predict
treatment interruption for HIV/AIDS.


After Jonathan, we'll welcome Beth Geoffroy, who will present on work that Dimagi is doing to
leverage large language models to address global health inequities. Beth is Dimagi's senior
director for global strategy partnerships and works alongside Dimagi's CEO and chief strategy
officer to drive investment in their newest initiatives through strategic partnerships.


Beth also supports Dimagi's efforts to expand equitable and responsible use of AI within the
organization and beyond. And our third panelist will be William Wu. He is the chief executive officer
at Quantitative Engineering Design, or QED,
and will present on their efforts to scale data collection using computer vision and AI. William
directs QED's mission to build pragmatic technologies for strengthening global health and food
security. Informed by working and living in sub-Saharan Africa, Dr. Wu's team has built data
systems and AI used by thousands of medical, clinical and agricultural agencies across 23
countries in the Global South.


With that I'll first turn it over to Jonathan for his presentation. Jonathan, take it away.


Jonathan Friedman 00:40:32 
Thank you very much, Kirsten.


Can I confirm that you can see my slides?


Kirsten Weand 00:40:48 


We can see them. Thank you. 


Jonathan Friedman 00:40:50 


Thank you. So like Kirsten mentioned, my name is Jonathan Friedman. I'm the Director of Data
Science at Palladium. I'm gonna speak about some of our work in machine learning, specifically to
predict interruption in treatment.


So first, I'll give a broad overview of the work we do at Palladium in machine learning and artificial
intelligence, and specifically through a project we support, the Kenya Health Management
Information Systems Project. We'll take a look at the current situation of treatment interruption in
Kenya, how we can use machine learning to prevent treatment interruption, the results we're
seeing thus far, and where we intend to go next.


At Palladium, we implement machine learning and artificial intelligence primarily through 2 PEPFAR-funded
projects: the Data for Implementation, or Data.FI, project, implemented by USAID, and the Kenya
HMIS Project, implemented by CDC. Our work falls into 3 buckets: one including above-site
applications, such as those focused on data quality and population-level trends;
a second focused on patient-centered machine learning models integrated with electronic
medical record systems and deployed as real-time decision support tools; and third, applications of
generative AI and large language models, including a recently deployed HIV informational chatbot
called Nishauri and text-to-code applications in Kenya.


I'm presenting today on work that we do as part of the Kenya HMIS Project. Kenya HMIS is the third
in a series of CDC health information systems projects implemented with Palladium. The goal of the
project is to support the Ministry of Health, county health management teams,
and service delivery partners to adopt and scale innovative health information systems. A focus of
the current phase of the project is providing person-centered clinical decision support. I should
say that I'm presenting today on behalf of a wonderful team of data scientists, engineers, and program
implementation colleagues, both at Palladium and CDC. I want to personally call out the leadership
from our CDC colleagues, Dr. Davis Kamanga, Dr. Ken Massamaro, and Dr. Thomas
Achiha, and Palladium colleagues, Dr. Jacob Odiambo Anteg, who leads the engineering team,
and Benedette Otieno.


My focus today will be machine learning for patient-centered care. We have 4 patient-centered 
models in different stages of development. 


One, the first model we developed and deployed nationally across Kenya takes
someone presenting at a health clinic who does not know their HIV status and predicts
the probability that they would test positive for HIV and should be referred for testing.


Second is a model that takes in information from a patient's longitudinal record, and predicts the 
probability that he or she will interrupt treatment defined as being 28 days late to a scheduled 
appointment. A 3rd model seeks to predict the probability that patients are virally non-suppressed, 
and a 4th is looking at risk around non-communicable diseases specifically suspected 
hypertension. 


So why are we talking about treatment interruption, or IIT? Well, treatment interruption slows progress
towards epidemic control, as patients who interrupt treatment are at greater risk of negative health
outcomes and of spreading HIV to others. Many HIV programs struggle to intervene with patients
before they experience treatment interruption, and many interventions go into effect only after
patients are IIT, which is called delayed intervention. Currently, in Kenya each patient receives the
same package of interventions before their appointment.


Only after patients are late do we make additional interventions, such as defaulter tracing,
exemplifying delayed intervention.


In Kenya there are currently around 1.3 million patients on HIV treatment. Of these, around 70%,
the 2 green bars on the far left, come to their appointments on time.


That's wisdom. 


Sorry. Approximately another 20% come late but return to treatment within 30 days, and so are not
designated as having experienced interruption in treatment. Around another 6 to 7% are late by
more than 30 days, and so are classified as IIT. About half of these return to treatment at some
point and about half do not.
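
As a rough illustration of the categories just described, the sketch below sorts one appointment outcome into one of the three groups by how many days late the patient returned. The function name, the category labels, and the use of `None` for a patient who has not returned are assumptions made for this example. The 30-day cutoff is the figure quoted in this passage (the model definition earlier in the talk uses 28 days), so it is left as a parameter.

```python
from typing import Optional


def classify_attendance(days_late: Optional[int], iit_cutoff_days: int = 30) -> str:
    """Place one appointment outcome into the categories described in the talk.

    days_late is the number of days between the scheduled appointment and the
    patient's actual return; None means the patient has not returned at all.
    (That convention is an assumption for this sketch.)
    """
    if days_late is None:
        return "iit"                # never returned: interruption in treatment
    if days_late <= 0:
        return "kept_appointment"   # on time or early
    if days_late <= iit_cutoff_days:
        return "late_returned"      # late, but back within the cutoff
    return "iit"                    # late by more than the cutoff
```

Calling `classify_attendance(12)` places a patient in the "late but returned" group, while passing `iit_cutoff_days=28` applies the stricter definition used for the model.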


The focus of this activity is the 2 categories of patients that missed their appointments. Can we
prioritize these patients for intervention before their scheduled appointments and try to move them
into the far left bucket of kept appointment?


So, putting these last few slides together, the theory of change for this work is as follows: if we can
use machine learning to predict which patients are at greatest risk of experiencing treatment
interruption,
and clinicians and case managers use those predictions to concentrate interventions among
high-risk patients, and those interventions are effective, then we will reduce treatment interruption.


Before we look at how the model is built, it's important to know that this work is possible because of
the investment in information systems during the earlier phases of this project. KenyaEMR, the
EMR developed and supported by the project, is built on OpenMRS and deployed at more than 2,000
health sites that provide HIV testing, care and treatment services. It includes various modules,
including for HIV testing services, HIV care, laboratory, pharmacy and others. Data from KenyaEMR
is centralized in a national data warehouse, where data is de-identified
and enables various data use cases and applications, including for program monitoring, surveillance
and machine learning.


For any patient-centered model we develop, we try to consider the patient encounter holistically,
including the who, the where and the when. The who includes things like patient demographics,
clinical history, drug history, and viral load history.


The where includes attributes of the health facility, such as the level of care, and population-level
behavioral norms that describe the context in which someone is seeking care. And finally, the when
can include things like the day of the week or the month of the encounter.


Data on the who and when come from the national data warehouse via KenyaEMR. Data on the where
comes from a combination of Kenya's Master Facility List and open-source geospatial data sets
from the Institute for Health Metrics and Evaluation, WorldPop and Meta.


So the model development process begins with data collected at the national data warehouse. We
proceed through what is a conventional process in the space of machine learning, including data
cleaning; feature engineering (feature is just another word that people use to mean variable);
and selection of candidate models. In machine learning, there isn't one machine learning model that we
train, but there are families of machine learning models, and we'll explore different ones for the
same use case.


We'll test and evaluate different machine learning models on historical data,
and then we'll attempt to interpret or try to understand what variables are most important to the
model: what information the model is primarily using when it makes a determination, for example,
that a patient is high risk or low risk for treatment interruption.
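
The step of testing several candidate models on the same historical data can be sketched in a few lines. This is only an illustration under stated assumptions: the talk does not name an evaluation metric, so the sketch uses AUC (the chance that a randomly chosen patient who interrupted treatment is scored above one who did not); the two "models" are trivial scoring rules rather than trained models; and the toy records are made up for the example. The feature names echo the risk factors mentioned later in the talk.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


def auc(scores: List[float], labels: List[int]) -> float:
    """Probability that a random positive case is scored above a random negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # Count a win when the positive outranks the negative, half a win on ties.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def evaluate_candidates(candidates: Dict[str, Callable[[Record], float]],
                        history: List[Record],
                        label_key: str = "iit") -> Dict[str, float]:
    """Score every candidate model on the same historical records."""
    labels = [int(r[label_key]) for r in history]
    return {name: auc([model(r) for r in history], labels)
            for name, model in candidates.items()}


# Made-up toy records, only to show the call pattern (not real patient data).
toy_history = [
    {"recently_iit": 1, "unscheduled_visits": 2, "iit": 1},
    {"recently_iit": 0, "unscheduled_visits": 0, "iit": 0},
    {"recently_iit": 0, "unscheduled_visits": 3, "iit": 1},
    {"recently_iit": 1, "unscheduled_visits": 0, "iit": 0},
]
toy_candidates = {
    "recent_iit_only": lambda r: r["recently_iit"],
    "unscheduled_only": lambda r: r["unscheduled_visits"],
}
results = evaluate_candidates(toy_candidates, toy_history)
```

Holding the historical records fixed and varying only the model is what makes the comparison between candidates meaningful.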


At this point the model is integrated into the EMR, which is the domain of our engineering team and
is no small feat. Their task is to integrate the model so that it works offline, as many of the facilities
that use KenyaEMR do not have regular Internet access. And this was a point made before by the
Audere team around designing
for these realities from the start. Second, the engineering team needs to deploy this model such
that it generates predictions quickly, in a matter of seconds, and finally, that these models are
deployed in a lightweight manner, so that they don't adversely affect or slow down the
general processing of the EMR at site. So, once deployed at the facility, clinicians enter data into the
EMR. They trigger the machine learning prediction process.


The machine learning results are then displayed in the EMR, and clinicians indicate whether and
how they use the machine learning risk information. Critically, this information is then stored and
transmitted back to the national data warehouse to enable our team to monitor how the models are
performing and being used.


So now let's look at how clinicians and case managers interact with the outputs of the machine
learning model. So at the right, you'll see a screenshot of KenyaEMR. At the top, the clinician sees
the model-determined IIT risk category, either high, medium or low. In addition to a risk category, we
also display risk factors, information around what drove the model to categorize the patient as high
or medium risk, for example, whether the patient was recently IIT, or whether they've come for
unscheduled appointments. At the bottom of the slide you'll see a line list generated for
case managers. This list is now sorted by IIT risk, enabling case managers to more easily prioritize
patients for intervention.
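
A minimal sketch of the two outputs described here, a risk category for the clinician and a risk-sorted line list for case managers, might look like the following. The cutoff values, class name, and field names are hypothetical; the talk does not say how model probabilities are mapped to the high, medium and low categories.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoffs; the talk does not state how probabilities map to categories.
HIGH_CUTOFF = 0.10
MEDIUM_CUTOFF = 0.04


@dataclass
class PatientRisk:
    patient_id: str          # de-identified ID (illustrative)
    iit_probability: float   # model output between 0 and 1


def risk_category(probability: float) -> str:
    """Map a predicted IIT probability to the category shown to the clinician."""
    if probability >= HIGH_CUTOFF:
        return "high"
    if probability >= MEDIUM_CUTOFF:
        return "medium"
    return "low"


def build_line_list(patients: List[PatientRisk]) -> List[PatientRisk]:
    """Return patients ordered from highest to lowest predicted IIT risk."""
    return sorted(patients, key=lambda p: p.iit_probability, reverse=True)
```

Sorting by the underlying probability rather than by the category label keeps a useful order within each category as well.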


The first part of our theory of change asserted that we can generate accurate predictions at site.
Our current IIT model was deployed in February of this year, and is doing a good job of
distinguishing between patients based on IIT risk and outcomes. Those considered high risk by the
model
experience an IIT rate of 13%. Those categorized as medium risk experience an IIT rate of 5%. And
those categorized as low risk experience an IIT rate of 2%. Altogether, those in the high and medium
risk categories represent around half of patients, but over 80% of the observed IIT.
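
To see how those per-category rates can add up to "over 80% of the observed IIT," here is a small worked calculation. The IIT rates are the ones quoted in the talk. The split of patients across categories is an assumption chosen for illustration: the talk only says that high and medium together are around half of patients.

```python
# IIT rates per risk category, as quoted in the talk.
iit_rate = {"high": 0.13, "medium": 0.05, "low": 0.02}

# ASSUMED share of patients in each category (not given in the talk, beyond
# high + medium being "around half").
patient_share = {"high": 0.20, "medium": 0.30, "low": 0.50}

# Expected IIT cases per patient contributed by each category.
iit_contribution = {k: patient_share[k] * iit_rate[k] for k in iit_rate}
overall_iit_rate = sum(iit_contribution.values())

# Fraction of all IIT that falls in the high and medium categories.
share_of_iit_in_high_medium = (
    iit_contribution["high"] + iit_contribution["medium"]
) / overall_iit_rate
```

Under these assumed shares, the high and medium categories contribute 0.026 and 0.015 of a total 0.051, or just over 80% of all IIT, which is consistent with the figure quoted. A different split of the high and medium groups would shift the result somewhat.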


Our focus now is embedding feedback mechanisms in the EMR to collect information for the
second and third clauses of our theory of change. Together with our CDC colleagues and service
delivery partners, we've identified packages of interventions specific to each risk category,
including for appointment management, assignment of case managers, robust client literacy and
differentiated service delivery. We already collect information on some of these interventions, and
are currently working to modify the EMR to include others, having validated that the model is
performing as intended at site.


As we collect this information from clinicians and case managers, we'll better understand how the
model predictions are used and what impact they have on IIT.


Last slide. We've a few lessons that we've learned along the way. So first, it is possible to develop
and integrate patient-centered models in electronic medical record systems as real-time decision
support, though the challenges are real: challenges in creating models using routine data,
but in some ways bigger challenges on the engineering side of integrating models in the EMR
system, because there's relatively little documentation or prior examples to learn from, and most
development that involves deploying machine learning models this day and age is assuming that
it's a cloud-hosted solution, which is not possible here, given the limitations around Internet access.
Second, that a machine learning activity should be framed as part of a theory of change,
and that the objective and focus should be on impacting clinical outcomes, not only generating
accurate models. And third, related to the previous point, that machine learning activities require a
whole-of-team effort, including data scientists, engineers, epidemiologists, quality assurance and
implementation leads.


Guidance and support from our CDC colleagues has been essential, and particularly the
invaluable insights and feedback we get from our service delivery partners, who use these tools and
give us the feedback that we need, that guides us in how we can make them better.


With that I'll hand back over and thank you for your time. 


Kirsten Weand 00:51:45 


Great. Thank you so much, Jonathan. Excellent presentation! Very, very interesting. In the interest
of time,
we'll do a quick turn over to Beth from Dimagi. Beth,
the floor is yours.


Beth Geoffroy | Dimagi 00:51:59 


Wonderful. Thanks so much, Kirsten, and thanks to everyone at the Gates Foundation and GHSD.
Dimagi is excited to participate today and share more about our platform, Open Chat Studio.
Dimagi is committed to advancing equitable use of AI, and large language models in particular, to
drive impact while prioritizing local inclusion and ownership.


Dimagi is a social enterprise founded in 2002 out of MIT and Harvard. Our mission is to build and
scale sustainable, high-impact digital solutions that amplify frontline work and support the
individuals that deliver those services to the last mile.


A key pillar of our 5-year strategy is to improve jobs to improve outcomes, and our newest
innovations are furthering that goal.


We're specifically focusing on solutions to support people living and working in lower- and middle-income
countries to ensure that all can benefit from advances in technology and generative AI.


Large language models are exhibiting language abilities previously only seen from humans. There
have been massive leaps forward in a computer's ability to show common sense,
creativity, empathy, as Audere talked about earlier, and problem solving.


this technology has been transformative. But with such rapid change comes an incredible 
opportunity and enormous risk. 


Our collective challenge is to make sure the adoption of LLMs is impactful, equitable, safe and
inclusive, so that it does not only benefit people in high-income countries.


To that end, Dimagi has developed a platform called Open Chat Studio, an easy-to-use, open-source
tool to rapidly prototype, test and deploy LLM solutions for global health and development.


Open Chat Studio is situated between a large language model and the end user.
It can connect to any LLM with an API, like GPT, Claude, or Llama.


The platform enables users to create customized chatbots, leveraging source material as needed,
and it provides guardrails to ensure safety and accuracy, checking incoming and outgoing messages from
both the user and the LLM before they're sent.
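
The idea of a layer that sits between the user and the model and checks messages in both directions can be sketched as follows. This is not Open Chat Studio's code or API; the function names, the simple keyword blocklist, and the fallback reply are all hypothetical, and `echo_llm` is a stand-in for a real LLM API call so that the sketch runs on its own.

```python
from typing import Callable, List

# Hypothetical blocklist; a real guardrail would be far more sophisticated.
BLOCKED_TERMS: List[str] = ["dosage override", "ignore previous instructions"]
FALLBACK_REPLY = "Sorry, I can't help with that. Please contact your health worker."


def passes_guardrail(message: str) -> bool:
    """Return True if the message contains none of the blocked terms."""
    lowered = message.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def handle_message(user_message: str, call_llm: Callable[[str], str]) -> str:
    """Check the incoming message, call the model, then check the outgoing reply."""
    if not passes_guardrail(user_message):
        return FALLBACK_REPLY          # blocked before reaching the model
    reply = call_llm(user_message)
    if not passes_guardrail(reply):
        return FALLBACK_REPLY          # blocked before reaching the user
    return reply


def echo_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call, used only to make the sketch runnable."""
    return f"You said: {prompt}"
```

Because the model is passed in as a plain function, the same checks apply whichever LLM is connected behind it.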


It also allows users to then deploy chatbots on easily accessible channels, such as WhatsApp,
Telegram, SMS or the web.


It can also capture user consent. It records transcripts of interactions. And it uses safety
layers and input formatters to keep the conversation safe and on track.


You can also have interactions completely over voice, if that's the preferred mode of
communication for a specific context or in low-literacy settings.


Dimagi uses this tool internally, but we also support a partner ecosystem to help others build
their own bots and identify unique ways in which LLMs can advance the work that they are doing.


Our efforts to date have been driven by a desire to discover the most impactful use cases. 


While many are focused on Q&A bots, Dimagi is exploring chatbot-led interactions, such as
coaching, role playing or interviewing.


Today, I'm going to share 3 key areas Dimagi is prioritizing in coordination with partners and
funders.


First, I'll show you some direct-to-client use cases, including how we're evaluating LLMs' ability to
operate in low-resource languages.


Second, and probably Dimagi's primary focus internally, is how LLMs can be used to support
frontline workers, improve their skills and improve their jobs to hopefully improve the outcomes of
those that they serve.


And finally, a quick overview of the work we're doing to create this inclusive ecosystem where LMIC
partners can easily use Open Chat Studio to create their own chatbots.


After all that I'll share where we hope to go next.
So first up is how LLMs can support individuals through direct-to-client use cases.
For this example and the next, I'll talk about multiple micronutrient supplements for pregnant women.


This first client-facing bot is an MMS usage tracker to support pregnant women to properly take
their vitamins during pregnancy.


The LLM brainstorms with women how they can overcome barriers to usage,
while also tracking key program metrics around uptake that program managers
can use.


You can see in this client bot interaction on the right that the bot is asking the client how they're
doing taking their vitamins,
and in a few key interactions, you can see the LLM use common sense in a way
that previously seemed impossible for computers:
namely, understanding that taking medicine half the days of the week is 3 or 4 times;
showing empathy for the user's concerns around the side effects they're experiencing; and
remembering what a user says earlier in the conversation to ask better, more directed follow-up
questions, like when it asks what other barriers the woman faces in taking her supplements, in
addition to the stomach pain she previously mentioned.


Further, by adding just the text here in orange, "speak in a way suitable for SMS," you can
dramatically change
the output and the way the bot interacts over SMS, using things like shorthand spelling and emojis,
like the bus, to align with what the user is saying.


By adding these 3 words, "speak in Swahili," you can quickly change how the bot operates. It
switches from English to Swahili, and it performs quite well out of the box.
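
The pattern described here, appending a short instruction to an existing chatbot prompt, can be sketched as plain string composition. The base prompt text and the helper name are illustrative assumptions; the actual prompt used in the demo is not shown in the transcript. Only the two appended instructions come from the talk.

```python
from typing import Iterable

# Illustrative base prompt; the actual prompt used in the demo is not shown in the transcript.
BASE_PROMPT = (
    "You are a supportive assistant helping a pregnant woman keep track of "
    "her multiple micronutrient supplements."
)


def build_prompt(base: str, extra_instructions: Iterable[str] = ()) -> str:
    """Append short style instructions to a base chatbot prompt, one per line."""
    parts = [base.strip()]
    parts.extend(instr.strip() for instr in extra_instructions if instr.strip())
    return "\n".join(parts)


sms_swahili_prompt = build_prompt(
    BASE_PROMPT,
    ["Speak in a way suitable for SMS.", "Speak in Swahili."],
)
```

Keeping the base prompt separate from the style instructions makes it easy to try the same bot across channels and languages.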


Dimagi has done some internal testing with staff and current partner networks to assess GPT's
performance in a number of African languages.


We found good results for many, though there were several where it performed quite poorly,
highlighting areas for further improvement of these models, or where using specific techniques in a
chatbot prompt could potentially improve performance.


Now I'll move on to how Dimagi is supporting frontline workers, and coaching is a key use case
we're interested in exploring.


Dimagi is working with partners to take their training materials, country guidance, or whatever
content they may want to use, and add that to the chatbot prompt to develop and deploy
coaches for frontline workers: to improve their knowledge, to build skills through training, to improve
job performance, help them feel supported and build their resilience.


Program managers can review transcript data for the AI coaches or use a third bot to summarize all
transcripts and identify potential areas for improvement, additional training, and so on. That way they
can target their limited time with workers to be more effective supervisors and human coaches.


Here are some of those coaching techniques I just mentioned, namely, bot-led interactive or
adaptive quizzing to meet users where they are in their learning journey;


case-based role plays to provide guidance on improved interpersonal communication, or how to
have difficult conversations with clients;


and check-in bots to support resilience building for workers.


Here's an example of a role play coach, again using the MMS example from before.


the transcript at the right shows how a bot can pretend to be a pregnant woman struggling to take 
their meds, allowing the frontline worker to practice their skills and improve uptake. 


And you can see how this could be applicable for workers who may need to have difficult 
conversations with ill clients or with clients that might be hesitant to enroll or adhere to lifelong 
medication. 


And again, at the right, it seems that the bot knows common side effects associated with taking
MMS and can be creative by being stubborn or apathetic about what the frontline worker is saying.


After a few of these back-and-forth interactions,
we instructed the bot to give the user some feedback on their approach, asking for 2 examples of
positive feedback and 2 areas for improvement.


Here the user was an internal tester at Dimagi, who doesn't actually know much about MMS, and
the bot rightly suggested that he was empathetic and encouraging, but could do better in providing
concrete solutions and sharing detailed information about MMS.


One other quick coaching use case that Dimagi is super excited about is the project that we're
actually starting today, where Dimagi is supporting a partner organization in Malawi, where
community health volunteers are conducting a child immunization campaign.


After each home visit, the CHV is required to participate in a bot-led coaching session to debrief on
the visit.


It uses motivational interviewing techniques to build the volunteers' communication skills and
capacity.


And we're planning to evaluate how a typical vaccinator compares to an AI-coached vaccinator in
terms of quality of their interactions with clients and vaccine uptake.


So I have mentioned how Dimagi is using Open Chat Studio internally. But I also want to share
about how we're working with partners to have similar explorations.


Dimagi has received over 6 million in funding to work with partners to support a wide range of use
cases in high-income and low-income contexts, ranging from improving family planning agency in
Kenya and Senegal to supporting providers in the U.S. to have conversations with clients about
serious illness.


Dimagi has also engaged over 40 organizations, providing onboarding sessions and user accounts
for Open Chat Studio so that they can provide feedback to us on the platform.


They're building bots with or without our help. They're piloting these internally and externally in
controlled environments.


And these organizations are self-funding the work or receiving joint funding together. And we're on track to
deploy select bots with actual users later this year.


And finally, our focus to date has been primarily on evaluation of these tools in controlled
environments, to effectively measure their utility, safety, accuracy, and adherence to purpose, and
to build an evidence base and a library of priority use cases in LMIC contexts.


Critical to this is building capacity to create and leverage these tools in contexts where our global
development partners hope to use them, so that they are context-appropriate and function well in
low-resource languages.


Finally, we're keen to highlight the risks and biases
as they appear, so that the next generation of models can be improved to benefit everyone more
equitably.


Thanks very much. 


Kirsten Weand 01:02:09 


Thank you so much, Beth. Really fascinating work, especially for improving access and
equity for large language models to harder-to-reach or low- and middle-income populations in rare
languages.


Finally, we're gonna turn the last 10 minutes to William Wu from QED, who will talk about some
computer vision techniques that they're working on. William, over to you.


William Wu [QED.ai] 01:02:34


Thank you very much. So it's a great pleasure to speak to this group today. My name is William. I'm
chief executive for QED,
or Quantitative Engineering Design. And we're going to be presenting ScanForm to you today. So
this is a really simple technology. It applies computer vision and AI to reduce the burden of data
entry and data analysis in global health.


So I'm just gonna share my screen with you. Just a little bit about me. So I've been working on
technology in Africa over the past decade, living in
countries like Tanzania and Kenya and Malawi. Today I'm calling you from Yaoundé in Cameroon.


So first, I want to explain our motivation for this technology. We all acknowledge that AI has
tremendous potential to help PEPFAR and to help global health in general. But,
you know, there's this law of equivalent exchange. AI is not magic that can produce something out
of nothing.


For it to produce something useful, we need to feed it with high-quality, complete and timely data,
ideally at national scale. And I think that's often been the struggle: how do you get that data?


So, for example, we would like to use AI for, say, automatic outbreak identification. When the next
pandemic comes through, can we automatically detect the anomalies and find it? Could we
automatically tailor our HIV strategies to the needs of each country, which are always changing?


But in practice
you can't actually do that if data is not flowing in. We can't detect it unless we have this constant
stream of digital data. And often the digital data is only being captured at high-resource sites, and
then we can't develop an equitable strategy if all the low-resource sites are going to be left behind.


So I would like to give you a metaphor that we've been throwing around in our company. So
imagine you hired this 3-star Michelin chef that can make fantastic food,
but for materials,
he's just using some rotten, expired materials from a garbage can, and he doesn't have the proper
materials. So it also reminds me of a phrase from Sherlock Holmes: "Data! Data! I can't make bricks
without clay."


So just as the chef needs high-quality ingredients, AI needs high-quality data to get meaningful
results.


And then we could ask the question: how many PEPFAR countries do we know which have complete
E 1st data at national scale?


There aren't that many. As far as we know, in terms of working with all these countries, I
haven't found one where you can actually get it at national scale yet, because many of them are
struggling with really basic needs: electricity, Internet, computer equipment, declining funds
and reduced staff.


So a lot of health facilities and communities like the ones in this picture, they just don't have data to
represent them. But I don't think there should be an excuse for this. So a little bit about myself: I
used to work at the NASA Jet Propulsion Laboratory, before going to Africa, as a telecommunications
engineer. And so my job was to get data from places like Mars and Venus that stretches beyond
the solar system, and even the surface of comets. So if we can get data accurately from those
kinds of places, why can't we get this data on Earth?


So our solution to that is ScanForm. And what ScanForm does is very simple: it
automatically extracts, automatically analyzes and automatically performs data quality assessment
on handwritten data from paper
in mere seconds, from anywhere. And I'd like to show you the 4 steps of this and do a quick demo.
So step one is writing on paper.


And this is something everyone already knows how to do. Nurses write on regular paper with a
regular pen. It can be printed in black and white. And for most people using ScanForm, that's the
end of it. They just write it once and move on.


And step 2 is to take a regular Android smartphone,
and only a few people need to use this, say supervisors. At the end of the day they take a photo of
the newly completed forms from that day. Taking a photo only takes a few seconds. It's about
300 kB.


So it's really easy, and you may only need one or 2 smartphones for a whole health facility. 


Then step 3 and step 4 are fully automatic. In about 3 seconds you can get all the data. It's in a
database. It can be exported to Excel. We can provide APIs, and we can also compute automatic
analytics and data quality assessments that get sent to DHIS2, OpenMRS and other systems. So I'd
like to do a very quick live demo for you right now.


And so this is this is my phone right here. 

Hope you guys can see that. 

So I'm going to click on the ScanForm app.

And as you can see, you have a little camera icon right here. 

So just a normal, cheap Android phone. I'm just gonna rotate my screen for the
people at home.


And as you can see here, I have an HIV testing register, just as a demo. So I'm going to take a
picture of it. It's got 10 patients.


Each patient is coming in on different days of the week, at different access points, as you can see.
There's sex, pregnancy, and age. Have they taken ARVs? Their HIV status, the testing results: a
typical kind of testing register. So again, your use case may look different. It doesn't have to
look like this one. Say I took a bad picture: it doesn't allow me to take it, so I have to get the whole
paper in the picture.


And so it's that simple. Just within about 3 seconds you're taking a photo. So this is what the 
nurses are doing each day. And also at the end of the month they could be 


capturing thousands of records in seconds just like this. 


So since I'm currently connected to the Internet, it's already uploaded, I think, to the server. But if I
didn't have Internet, I would also be fine, because the system will cache the data and then
send it automatically whenever there is a mobile network. So everything we do does not rely on
stable electricity or internet.
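
As an illustration only, the cache-then-send behavior described above can be sketched as a small store-and-forward queue. This is not ScanForm's implementation: the class name, the `upload` callback, and the `network_available` flag are all hypothetical, and a real app would persist the queue to disk rather than keep it in memory.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardQueue:
    """Hypothetical offline-first upload queue (illustrative, not ScanForm code)."""

    def __init__(self, upload: Callable[[bytes], bool]) -> None:
        # `upload` is an assumed callback that returns True once the server
        # has accepted the photo.
        self._upload = upload
        self._pending: Deque[bytes] = deque()

    def capture(self, photo: bytes) -> None:
        """Always cache the photo first, whether or not there is a network."""
        self._pending.append(photo)

    def flush(self, network_available: bool) -> int:
        """Send cached photos in order; stop at the first failed upload."""
        if not network_available:
            return 0
        sent = 0
        while self._pending:
            if not self._upload(self._pending[0]):
                break  # keep the photo cached and retry on the next flush
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self) -> int:
        return len(self._pending)
```

A photo is removed from the cache only after the upload succeeds, so a dropped connection in the middle of a flush does not lose data.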


So you can see the records have already been uploaded, since I happen to be currently
connected. Just to show you one quick example of how this works: there was a mention of using
AI cautiously. So there are about 87 questions on the page, and it only asks for verification for a few
questions. So usually over 98% of the workload is done, and these are considered just fine. And so
then you can also see all the work that the AI has done, and so for every field, you can see what
the human has written and what the AI has guessed.
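
One common way to get this kind of selective verification is to route only low-confidence fields to a person. The sketch below assumes each extracted field comes with a confidence score between 0 and 1 and uses a hypothetical threshold of 0.95; the talk does not say how ScanForm actually decides which questions to flag.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FieldGuess:
    field: str         # name of the form field (hypothetical)
    value: str         # what the model read from the handwriting
    confidence: float  # assumed to be a score in [0, 1]


def split_for_review(
    guesses: List[FieldGuess], threshold: float = 0.95
) -> Tuple[List[FieldGuess], List[FieldGuess]]:
    """Return (auto-accepted fields, fields a human should verify)."""
    accepted = [g for g in guesses if g.confidence >= threshold]
    to_review = [g for g in guesses if g.confidence < threshold]
    return accepted, to_review


def automation_rate(guesses: List[FieldGuess], threshold: float = 0.95) -> float:
    """Fraction of fields accepted without human verification."""
    if not guesses:
        return 0.0
    accepted, _ = split_for_review(guesses, threshold)
    return len(accepted) / len(guesses)
```

With 87 fields on a page, flagging a single field for review leaves 86 of 87 handled automatically, which is in line with the "over 98%" figure mentioned in the demo.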


So this is taking away about 98% of the burden in doing data extraction. And what's more, not only
are we getting this data automatically, but the analytics are done automatically. So if you refresh
this board: recall, I took 3 photos. There are 10 patients per page. So I should see about 30
patients.


Yes. And so that's almost what we have. I'd say there's one that was missing, so, okay, maybe an
HIV test result was not filled out. So you have 30 records: 7 positive, 22 negative. I can see on what
days of the week they arrived. I have page numbers, so if any pages are missing in a register, we
automatically detect it and send them a notification in 12 hours.
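
A minimal sketch of a missing-page check, assuming register pages are numbered consecutively and each scan reports its page number. The function name is hypothetical, and a gap is only detectable once a later page has been scanned.

```python
from typing import Iterable, List


def missing_pages(scanned_pages: Iterable[int]) -> List[int]:
    """Return page numbers absent between the lowest and highest scanned page."""
    seen = set(scanned_pages)
    if not seen:
        return []
    return [p for p in range(min(seen), max(seen) + 1) if p not in seen]
```

For example, if pages 1, 2, 4, 5, and 7 of a register have been photographed, the check reports pages 3 and 6 as missing, which could then trigger a notification to the site.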


And, what's more, all of these very mind-numbing statistics, like calculating positivity, negativity,
and recency by various disaggregates, like pregnancy and circumcision: this is all done automatically.
And healthcare workers don't need to do this accounting anymore.
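
To make the "accounting" concrete, here is a small sketch of positivity computed per disaggregate. The record layout (dictionaries with a `result` key and arbitrary grouping keys) is an assumption for illustration, not ScanForm's data model. Records with a missing or unreadable result are left out of the denominator.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def positivity_by(records: Iterable[Mapping[str, str]], key: str) -> Dict[str, float]:
    """Positivity (positives / valid results) for each value of `key`."""
    counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])  # [positives, valid]
    for record in records:
        result = record.get("result")
        if result not in ("positive", "negative"):
            continue  # blank or unreadable result: exclude from the denominator
        group = record.get(key, "unknown")
        counts[group][1] += 1
        if result == "positive":
            counts[group][0] += 1
    return {group: pos / valid for group, (pos, valid) in counts.items()}
```

Applied to the demo's 30 records (7 positive, 22 negative, 1 blank), overall positivity is 7 out of 29 valid results, roughly 24%.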


So that's a quick demo and moving to my next slide here. 
Oops. Sorry. 


So ScanForm has been deployed across a very wide variety of use cases: all the different parts of
a health facility, the different wards, from maternal and child health to HIV and malaria. Everything
you see on the left, we've converted to ScanForm in one country or another. We've deployed it in
about 11 different countries.


Our biggest deployments today are in Malawi, for all HIV testing at national scale, and also in western
Kenya, in the malaria and MCH programs. I'm currently in Cameroon, and we're launching
national-scale community HIV testing for pediatrics.


And so basically, anything that can be written on paper, you can convert to ScanForm really easily.


We've won a few awards for our work that I wanted to mention. So, MIT Solve 2022: best technology
for low- and middle-income countries for healthcare. We're in the WHO HTS guidelines, in
the appendix, as an example model for countries to follow, and we won an award from AWS
for health equity, promoting health equity across countries. And we're participating in many
conferences, as you can see on the right. We'd be happy to see you there.


And these are just a few slides from our work in Malawi.

In just 19 months, starting from nothing, we have 7.6 million HIV testing records, as well as hepatitis
B and syphilis. We think it might be one of the largest, if not the largest, archives of really clean
electronic HIV data in Africa, and one of the fastest scaling: in just, you know, about a year,
covering over 90% of all the access points in the country, including the community.


So you can produce analytics like this, which previously, in over 20 years, we were not able to
generate without this AI: every access point getting data about positivity, yield, and test counts,
even in the community, and updating these kinds of graphs every single day without requiring huge
amounts of money in M&E.


So not only does that information and analysis go to national level, but we also send it back to the sites.
So the sites get the same statistics in site-level reports. And they also get data quality
analysis reports, so they can fix the data themselves, retake the photo, and just constantly improve.
And they save a lot of time, like I mentioned, about 25 to 30% of the time, because there are
over 16 reports and 600-something indicators for PEPFAR that are automatically calculated and just
sent to DHIS2. So you don't need to worry about it.


So we really believe in using the right tool for the right places. 


So here I've shown 2 examples. On the left-hand side, maybe 75% of the facilities we see look
something like this: no connectivity, limited power, leaking roofs, no security. So EMRs can be really
difficult to run there. And so for those kinds of places, maybe ScanForm is more appropriate. But there
are also other places where an EMR can work a bit better. But no matter what, in every country we see,
you need a hybrid solution where you can use both paper and electronic solutions.


So a quick slide about interoperability and security. We export data in all different kinds of formats. We
have integrations with DHIS2, OpenMRS, and OpenHIM, and we adhere to many security
standards. In every country we work in, we want to adhere to the national data protection act.
We've also been audited against GDPR and FDA Part 11 and passed those tests.


Hosting can be done locally. It can be done in the cloud. On ownership, the MoH definitely owns the
system and all the data, and we have many security measures, like encryption at rest and in
transit. And we can also make the photos automatically evaporate.


So about the costing: we also want this to be sustainable. So all our graphs look something like
this. This is an example from Malawi of declining costs. So the cost of investment is about
100 times less than what has been invested before to try to get this kind of electronic
data. And we want the governments to be able to take on these costs on their own, even with
declining donor funds.


So to close, I want to talk about this analogy of sustainability and acceleration. Sometimes we think
there's a trade-off, as if these 2 are on a seesaw, where you want to sustain existing systems, but
then it's really hard to accelerate and grow. And if you want to have both, it would be nice if you
had a solution that was very simple. It has to be low cost, and it has to be able to integrate with
other systems, so that we can maximize our return on investment for existing systems. And we hope
that maybe you're convinced that ScanForm is an example solution that uses AI to lift the whole
board, rather than choosing one or the other, such that every site can have fully electronic data.


So that's our presentation. Thank you very much. 


Guillaume Chabot-Couture 01:14:34 
Thank you, William, Beth, and Jonathan, for the very detailed presentations, this whirlwind
view into AI. This

is our entry into the discussion section. We have about 10-ish minutes for discussion. I wanted to
kick it off, maybe with a question to you, Beth, on large language models specifically.


It's amazing that these large language models can talk, right, and essentially can interact in a
way that feels human. Yet they are very much not human. We started the auto race section
with

a question to the audience, asking: what can AI do, and what should humans do? What is your
perspective on the risk that, as these AIs get better and better, we interact with them as though they
are humans, when they're very much not? And what risk do you see

in this context for people who are more vulnerable, for whom that difference might not be so
obvious? We seek to reach vulnerable populations with AI, to give them services or to
help them. But for you, what is the risk that an AI could pose when it's perceived to be more human
than it should be? And how do you think about this balance of empathy and approachability, while still,

you know, being clear that they're interacting with a computer?


Beth Geoffroy | Dimagi 01:16:06


That's a great question, and thanks for asking it. I think it is really easy to think that these tools are
human-like. You know, even when you're having a conversation with it yourself, the way it
responds, the way it understands what you're saying, even when you don't necessarily make
sense: it's really easy to think that. And I think that's the huge benefit of it, right? There is no
other tool that can do that. So you really have to balance the utility of that human-like nature


with the risks that you talked about. I think telling people, when they're putting this
chatbot on WhatsApp, that this is not a human, this is a chatbot, and reminding them of that
throughout the conversation;

directing them to a clinical professional if they raise questions that should not be answered by a
chatbot;

and making sure you have a human in the loop, depending on what context or what use case you're
working in: all of that is super important. And then, yeah, continuously reminding them that it's a
chatbot. And usually GPT will often say, "I'm a chatbot, and I can't do that." And I think having that
sentiment in there is really important. But again, you have to balance the benefit of something
that you want to have a conversation with, that doesn't feel stilted, with the risks that may come
up.


Guillaume Chabot-Couture 01:17:18 


Thank you so much. And maybe I can ask one more question and then pass it over to my colleague
Kirsten. One of the issues that we often face in low-resource settings is that there's very little data to
work with. And

you know, how do we make data more representative in this context, for large language
models and AI, is a real and big question. I'm curious what you all as panelists think about this, and
how you would

significantly improve the situation, so that in the next 5 to 10 years we can have parity between
low-resource settings and high-resource settings.


Beth Geoffroy | Dimagi 01:17:55 


Yeah, also a great question. I think that's part of why we're so interested in this language
exploration, to really see: can these function in low-resource languages at all? Can they make sense?
It may not be a perfect direct translation, but if it's enough that it can make sense and provide
utility to the user, that's super important. The models: if you're using our API with OpenAI,
the models are not being trained on the data

that we're submitting. But continuously flagging issues, areas of bias, areas where the large
language model is not performing well, that's really where we're at. We're at an evaluation stage.
We want to see how useful these can be, and finding where they're performing really poorly is a hugely

important outcome, so that we know, like, what are the best use cases right now? What might be the
best use cases in 6 months

or 9 months, once these models have changed? And I think one thing I didn't mention was the
importance of open-source models. Right now, none of the open-source models we've tried
compare in terms of functionality and abilities

to the GPT, the Llama. And so we are using those proprietary models. But I think once we
are able to leverage open-source models, people will have a lot more control over their data. They'll be
able to lock down the model where it's at right now, and be able to use these tools more
confidently, knowing sort of what data it's been trained on, or knowing where their data is gonna
go.


Guillaume Chabot-Couture 01:19:19 


Thank you, Beth. Maybe Jonathan and William. If you have a few thoughts on that as well. 


Jonathan Friedman 01:19:28 


Sure. So I will speak to the LLM side. I think that's covered. I think there's a
ways to go. It may not be immediately achievable, or even

a goal to prioritize, to make more representative local LLMs, as opposed to the kind
of infrastructure that Beth is talking about, where we're trying to have some kind of
intermediate layer between the user and an existing LLM to do some of that conversion for you.
From our presentation, I mean, this matters a lot also in the area of machine learning,

training patient-centered models on data collected by the EMR. We have, you know, many sites
that report through our EMR, and we have many sites that don't. Among the sites that do, our team

worked very hard to get the reporting rates and completeness as near 100% as they
can be, but it's never perfect. And so these are also things that we have to think about very, very
carefully, to understand: whose data are we accessing? Whose data are we not accessing through the
EMR when we deploy these models? What gaps might we have? And we have to make proper
adjustments. That could be a whole other conversation. But just to say that it very
much impacts how we approach the development of patient-centered models as well.


William Wu [QED.ai] 01:20:46


Sorry, I might have missed it. The question was about local development of models?


Guillaume Chabot-Couture 01:20:50 


It was about data and representativity of data. I mean, William, we've talked a lot
about improving the data quality. How tall is that mountain to climb for the
development sector?


William Wu [QED.ai] 01:21:04


Uh-huh, yeah. I think a lot has been spent on trying to improve data quality
manually. So for this, we take a small subset of sites, and we have a whole separate

workstream and budget to go out there and inspect records manually and fix them. So we try
to use technology and AI to make that feedback loop continuous: continuous quality
improvement that the healthcare worker can handle themselves, by just talking back and forth with
the AI until the data quality improves and they can see the data quality errors going down.


Another thing that we do is we make sure that 


everything's as localized as possible. So when we use AI in the various countries, we actually
calibrate our models based on how people write in each country. And so that makes it a lot less
likely for the AI to make the wrong decision. And so data quality also improves that way.


Kirsten Weand 01:21:59 


Great, thanks, William. I'm gonna ask a final question here, probably, 'cause we have about 5 minutes
left. So, a last question that picks up on some of the unanswered questions here in the chat. But for
those that are interested in other questions that were asked, there's a lot that has
been going on in the Q&A. So if you do wanna check out what other questions people have
been asking, and the answers to those, please look there.


But a question for each of the panelists here. You know, we've talked about some of the
technical challenges that we see, with the question just asked on data availability,

and on optimizing these different AI technologies and tools. But a question for all of you,
since there's been a lot of piloting and real-world implementation of these: what challenges
have you run into more in the enabling environment? So there's a question in the chat
about

data sovereignty, data privacy, and what that looks like. But in the policy or enabling environment, in
the countries that you've worked in, with the partners you've worked with, what are some of the big
challenges that you see that you think we need to overcome in order to allow these AI tools to be
adopted at pilot, but also, more importantly, at scale for maximum impact?


So, Jonathan, I think I'll turn to you first, if you want to take a stab at that.


Jonathan Friedman 01:23:17 


Sure. Yeah, that's such a huge question.


I think there are rules around data residency that apply in Kenya. So we are careful not to send
any of our data, through our machine learning operations or AI, outside of Kenya. And in Kenya,
you know, the big cloud providers are not here yet. AWS, we hope, will be
coming soon, and Microsoft also has announced plans to develop a data center a little bit
outside Nairobi. And so hopefully,

hopefully in the years to come, the situation is gonna change. But right now we need to think
about training models and hosting models in ways in which we can't take advantage of cloud
compute,


and so that affects what kinds of models we try to train. We're also talking about offline
deployments, deployments where the models are completely at site. And so we also need the
models that we deploy to take up very little memory and have a very low footprint. And so
those kinds of issues also affect

decisions that we make at the very start of our machine learning process, around what kind of
models we should consider and what kind of models we can package in a lightweight offline
environment. And so we try to take a very practical perspective from the start. I can also say, though this
wasn't the main focus,


we're also thinking through some of these issues with our HIV informational chatbot. And I'm
curious, Beth, for your perspective; I don't want to put you on the spot. But the initial
versions we are building use an OpenAI engine, you could say, underneath. But then we're
also

trying to be careful around passing anything that would be considered health data, for

example, to an AI bot, sending that data outside the country. And so these are issues that we're
thinking through. And I'd be curious what the other panelists think about that as well.


Beth Geoffroy | Dimagi 01:25:06


Yeah, maybe I can just respond quickly. Oh, sorry.


Kirsten Weand 01:25:09 


Go for it, Beth, go for it. 


Beth Geoffroy | Dimagi 01:25:10


Yeah, just to go back: Open Chat Studio is an open-source tool. So if you want to take it and run it
on your own servers, we're more than happy for you to do that. Dimagi also wants the users to

be in control of their data, so we don't control the data; you all have access to your data. And then
again, the more we can move towards open-source

LLMs, the more control countries or partner organizations will have over the data that's being
transmitted.


Kirsten Weand 01:25:39 


Great, thank you. And William, any thoughts on this?


William Wu [QED.ai] 01:25:43


Yes, I want to echo the point from the previous speaker about the data center. So there are many
places with very limited data centers and limited infrastructure. And I think there's also some lack of
understanding about it. So I think the cloud could be very helpful to unlock the power of AI at an
affordable cost. But few people seem to understand what it means. Like, once I was talking with a
minister, who asked: "So if the data is in the cloud and it starts raining, are you telling me that I'm
going to start losing my data?"


And so there's a kind of education necessary. And I was thinking that when we implement health
programs,

you get doctors, nurses, and public health professionals to implement a lot of that work. If you're
doing surgery, you get a great surgeon to do it. And we need to have a similar approach for informatics
and AI. We should bring some of the best mathematicians, computer scientists, and software engineers
to the table if we want to get the best results for AI and public health. So that's one of my
suggestions for changing a little bit the way we work. Thank you.


Kirsten Weand 01:26:43 


Awesome. Thank you. I definitely love that point on workforce education and raising
awareness about what things are and are not, and considerations for things, especially in the
context of AI and these emerging, evolving, quickly advancing technology spaces.


Alright, we're just at about time here, everybody. I really appreciate all the wonderful
presentations today from everybody. It's been enlightening, incredibly informative, and I think has
spurred a lot of discussion. I'd also like to thank the participants here for joining today. At one
point we had over 300 people, so it is a popular topic.


And we're glad to stimulate some of this discussion. As we noted up top, we're looking to have
additional listening sessions like this in the future. And so we have a quick question for the group
here, just to get a sense of what additional topics on AI would be of interest to you in the future, to
bring in

new speakers and keep this engagement going. So I believe


a poll will be open here in just a moment where you can go ahead and add your
thoughts. Thank you. There it is. Feel free: you have up to 500 characters, so it doesn't
have to be short, or it can be very short, whichever you please. But again, just thank you to all of our
speakers. Thank you to the Gates Foundation for partnering on these listening sessions and
co-hosting, and thank you to all the participants here today.


So with that, I will let everybody answer the question. We'll leave the session open for
just a couple of minutes, so that we can get some of that feedback on the poll. And with that, I also
wish you all a good rest of your day or a good evening, wherever you are, and we will look
forward to engaging again soon.


Thank you. 


